AAAI.2021 - Student Abstract and Poster Program

Total: 105

#1 Role of Optimizer on Network Fine-tuning for Adversarial Robustness (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Akshay Agarwal ; Mayank Vatsa ; Richa Singh

The solutions proposed in the literature for adversarial robustness are either not effective against the challenging gradient-based attacks or are computationally demanding, such as adversarial training. Adversarial training, or network training with adversarial data augmentation, shows the potential to increase adversarial robustness. While such training seems compelling, it is not feasible for resource-constrained institutions, especially in academia, to train networks from scratch multiple times. Our two-fold contributions are: (i) providing an effective defense against white-box adversarial attacks via a few network fine-tuning steps and (ii) observing the role of different optimizers in robustness. Extensive experiments are performed on a range of databases, including Fashion-MNIST and a subset of ImageNet. We find that a few steps of network fine-tuning effectively increase the robustness of both shallow and deep architectures. For further observations, especially regarding the role of the optimizer, we refer the reader to the paper.
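
A minimal sketch of the kind of study described, assuming a PyTorch setup: fine-tune the same pretrained backbone for a small number of steps under different optimizers and compare the resulting robustness. The model, data loader, learning rates, and step count are placeholders, not the authors' protocol.

```python
# Minimal sketch (not the authors' exact protocol): fine-tune a pretrained
# classifier for a few steps and compare optimizers. Model, data loader and
# hyperparameters are placeholders.
import torch
import torch.nn as nn
import torchvision.models as models

def fine_tune(model, loader, optimizer, steps=100, device="cpu"):
    """Run a small number of fine-tuning steps with the given optimizer."""
    model.train()
    criterion = nn.CrossEntropyLoss()
    step = 0
    while step < steps:
        for images, labels in loader:
            if step >= steps:
                break
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
            step += 1
    return model

# Compare optimizers on the same pretrained backbone (hypothetical setup).
# loader = ...  # e.g. a Fashion-MNIST or ImageNet-subset DataLoader
# for make_opt in [lambda p: torch.optim.SGD(p, lr=1e-3, momentum=0.9),
#                  lambda p: torch.optim.Adam(p, lr=1e-4)]:
#     model = models.resnet18(pretrained=True)
#     fine_tune(model, loader, make_opt(model.parameters()))
```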

#2 A Serverless Approach to Federated Learning Infrastructure Oriented for IoT/Edge Data Sources (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Anshul Ahuja ; Geetesh Gupta ; Suman Kundu

The paper proposes a serverless, mobile-relay-based architecture for a highly scalable federated learning system targeting low-power IoT and edge devices. The aim is an infrastructure that end users can easily deploy on a public cloud platform, democratizing the use of federated learning.

#3 Reward based Hebbian Learning in Direct Feedback Alignment (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Ashlesha Akella ; Sai Kalyan Ranga Singanamalla ; Chin-Teng Lin

Imparting biological realism to the learning process is gaining attention as a way to produce computationally efficient algorithms without compromising performance. Feedback alignment and the mirror neuron concept are two such approaches: the feedback weights remain static in the former and are updated via Hebbian learning in the latter. Though these approaches have proven to work efficiently for supervised learning, it remained unknown whether they are applicable to reinforcement learning. Therefore, this study introduces RHebb-DFA, in which reward-based Hebbian learning is used to update the feedback weights in direct feedback alignment mode. The approach is validated on various Atari games and achieves performance comparable to DDQN.

#4 Clustering Partial Lexicographic Preference Trees (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Joseph Allen ; Xudong Liu ; Karthikeyan Umapathy ; Sandeep Reddivari

In this work, we consider distance-based clustering of partial lexicographic preference trees (PLP-trees), intuitive and compact graphical representations of user preferences over multi-valued attributes. To compute distances between PLP-trees, we propose a polynomial time algorithm that computes Kendall's Tau distance directly from the trees and show its efficacy compared to the brute-force algorithm. To this end, we implement several clustering methods (i.e., spectral clustering, affinity propagation, and agglomerative nesting) augmented by our distance algorithm, experiment with clustering of up to 10,000 PLP-trees, and show the effectiveness of the clustering methods and visualizations of their results.
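
The paper's algorithm computes Kendall's Tau directly from PLP-trees; as a simplified illustration only, the sketch below computes Kendall's Tau distance between the total orders that such trees induce and feeds a precomputed pairwise distance matrix to agglomerative clustering via SciPy. The example rankings are hypothetical.

```python
# Simplified illustration, not the paper's tree-based algorithm: Kendall's Tau
# distance between two rankings (lists of the same items), then agglomerative
# clustering on a precomputed pairwise distance matrix.
from itertools import combinations
import numpy as np
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, fcluster

def kendall_tau_distance(r1, r2):
    """Number of item pairs ordered differently by the two rankings."""
    pos1 = {item: i for i, item in enumerate(r1)}
    pos2 = {item: i for i, item in enumerate(r2)}
    discordant = 0
    for a, b in combinations(r1, 2):
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0:
            discordant += 1
    return discordant

# Hypothetical rankings induced by PLP-trees over the same outcome set.
rankings = [list("abcd"), list("abdc"), list("dcba"), list("dcab")]
n = len(rankings)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        dist[i, j] = dist[j, i] = kendall_tau_distance(rankings[i], rankings[j])

# Agglomerative nesting (average linkage) on the precomputed distances.
labels = fcluster(linkage(squareform(dist), method="average"), t=2, criterion="maxclust")
print(labels)
```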

#5 Logic Guided Genetic Algorithms (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Dhananjay Ashok ; Joseph Scott ; Sebastian J. Wetzel ; Maysum Panju ; Vijay Ganesh

We present a novel Auxiliary Truth enhanced Genetic Algorithm (GA) that uses logical or mathematical constraints as a means of data augmentation as well as to compute loss (in conjunction with the traditional MSE), with the aim of increasing both the data efficiency and accuracy of symbolic regression (SR) algorithms. Our method, logic-guided genetic algorithm (LGGA), takes as input a set of labelled data points and auxiliary truths (ATs) (mathematical facts known a priori about the unknown function the regressor aims to learn) and outputs a specially generated and curated dataset that can be used with any SR method. We evaluate LGGA against state-of-the-art SR tools, namely Eureqa and TuringBot, and find that using these SR tools in conjunction with LGGA results in them solving up to 30% more equations while needing only a fraction of the data required by the same tools without LGGA, i.e., up to a 61.9% improvement in data efficiency.
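
A hedged sketch of the auxiliary-truth idea (not the LGGA implementation): a known mathematical fact about the target function is used both to generate extra labelled points and to add a violation penalty on top of the MSE. The symmetry auxiliary truth, function names, and penalty weight below are assumptions for illustration.

```python
# Hedged sketch of the auxiliary-truth (AT) idea, not the LGGA implementation.
# Example AT (assumed for illustration): the target f is symmetric, f(x, y) = f(y, x).
import numpy as np

def augment_with_symmetry(X, y):
    """Data augmentation from the AT: every labelled point (x1, x2, y)
    also yields the point (x2, x1, y)."""
    X_swapped = X[:, [1, 0]]
    return np.vstack([X, X_swapped]), np.concatenate([y, y])

def at_violation(candidate, X):
    """Mean squared violation of the symmetry AT by a candidate expression."""
    return np.mean((candidate(X[:, 0], X[:, 1]) - candidate(X[:, 1], X[:, 0])) ** 2)

def lgga_style_loss(candidate, X, y, lam=1.0):
    """Traditional MSE plus an AT penalty, as one way to combine the two signals."""
    mse = np.mean((candidate(X[:, 0], X[:, 1]) - y) ** 2)
    return mse + lam * at_violation(candidate, X)

# Usage with a hypothetical candidate expression from the GA population:
# candidate = lambda x1, x2: x1 * x2 + 0.1 * x1   # slightly violates symmetry
# X, y = augment_with_symmetry(X_raw, y_raw)
# fitness = -lgga_style_loss(candidate, X, y)
```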

#6 Responsible Prediction Making of COVID-19 Mortality (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Hubert Baniecki ; Przemyslaw Biecek

For high-stakes prediction making, Responsible Artificial Intelligence (RAI) is more important than ever. It builds upon Explainable Artificial Intelligence (XAI) to advance the efforts in providing fairness, model explainability, and accountability to AI systems. During a literature review of COVID-19 related prognosis and diagnosis, we found that most of the predictive models are not faithful to the RAI principles, which can lead to biased results and wrong reasoning. To solve this problem, we show how novel XAI techniques boost the transparency, reproducibility and quality of models.

#7 Encoding Temporal and Spatial Vessel Context using Self-Supervised Learning Model (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Pierre Bernabé ; Helge Spieker ; Bruno Legeard ; Arnaud Gotlieb

Maritime surveillance is essential for preventing illegal activities and for environmental protection. However, the unlabeled, noisy, irregular time-series data and the large area to be covered make it challenging to detect illegal activities. Existing solutions focus only on trajectory reconstruction and probabilistic models that ignore the context, such as the neighboring vessels. We propose a novel representation learning method that considers both temporal and spatial contexts, learned in a self-supervised manner using a selection of pretext tasks that require no manual labeling. The underlying model encodes the representation of maritime vessel data compactly and effectively. This generic encoder can then be used as input for more complex tasks lacking labeled data.

#8 Unsupervised Causal Knowledge Extraction from Text using Natural Language Inference (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Manik Bhandari ; Mark Feblowitz ; Oktie Hassanzadeh ; Kavitha Srinivas ; Shirin Sohrabi

In this paper, we address the problem of extracting causal knowledge from text documents in a weakly supervised manner. We target use cases in decision support and risk management, where causes and effects are general phrases without any constraints. We present a method called CaKNowLI which only takes as input the text corpus and extracts a high-quality collection of cause-effect pairs in an automated way. We approach this problem using state-of-the-art natural language understanding techniques based on pre-trained neural models for Natural Language Inference (NLI). Finally, we evaluate the proposed method on existing and new benchmark data sets.
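
The abstract gives no implementation details; as one plausible building block, the sketch below scores how strongly a sentence entails a templated causal hypothesis with a pretrained NLI model from the transformers library. The model name, template, and example sentence are assumptions of this sketch, not CaKNowLI itself.

```python
# Illustrative building block only: score whether a sentence entails a templated
# causal hypothesis using a pretrained NLI model. Model name and template are
# assumptions for this sketch, not the CaKNowLI pipeline.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def entailment_score(premise, hypothesis):
    """Probability that the premise entails the hypothesis."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)[0]
    # For roberta-large-mnli the label order is (contradiction, neutral, entailment).
    return probs[2].item()

premise = "Heavy rainfall over the weekend caused flooding in low-lying areas."
score = entailment_score(premise, "Heavy rainfall causes flooding.")
print(f"entailment probability: {score:.3f}")
```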

#9 Early Prediction of Children’s Task Completion in a Tablet Tutor using Visual Features (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Bikram Boote ; Mansi Agarwal ; Jack Mostow

Intelligent tutoring systems could benefit from human teachers’ ability to monitor students’ affective states by watching them and thereby detecting early warning signs of disengagement in time to prevent it. Toward that goal, this paper describes a method that uses input from a tablet tutor’s user-facing camera to predict whether the student will complete the current activity or disengage from it. Training a disengagement predictor is useful not only in itself but also in identifying visual indicators of negative affective states even when they don’t lead to non-completion of the task. Unlike prior work that relied on tutor-specific features, the method relies solely on visual features and so could potentially apply to other tutors. We present a deep learning method to make such predictions based on a Long Short Term Memory (LSTM) model that uses a target replication loss function. We train and test the model on screen capture videos of children in Tanzania using a tablet tutor to learn basic Swahili literacy and numeracy. We achieve balanced-class-size prediction accuracy of 73.3% when 40% of the activity is still left.
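
A hedged sketch of an LSTM completion predictor with target replication: the completion label is applied at every timestep in addition to the final one, and the two loss terms are mixed. The architecture sizes and the mixing weight alpha are assumptions, not the paper's exact configuration.

```python
# Hedged sketch of target replication with an LSTM (architecture details and the
# mixing weight alpha are assumptions, not the paper's exact configuration).
import torch
import torch.nn as nn

class CompletionPredictor(nn.Module):
    def __init__(self, feat_dim=128, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, x):                  # x: (batch, time, feat_dim)
        out, _ = self.lstm(x)
        return self.head(out).squeeze(-1)  # per-timestep logits: (batch, time)

def target_replication_loss(logits, label, alpha=0.5):
    """Mix the final-step loss with the label replicated at every timestep."""
    bce = nn.BCEWithLogitsLoss()
    final = bce(logits[:, -1], label)
    replicated = bce(logits, label.unsqueeze(1).expand_as(logits))
    return alpha * replicated + (1 - alpha) * final

# Usage with hypothetical per-frame visual features:
# x = torch.randn(8, 30, 128); label = torch.randint(0, 2, (8,)).float()
# loss = target_replication_loss(CompletionPredictor()(x), label)
```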

#10 Fair Stable Matchings Under Correlated Preferences (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Angelina Brilliantova ; Hadi Hosseini

Stable matching models are widely used in market design, school admission, and donor organ exchange. The classic Deferred Acceptance (DA) algorithm guarantees a stable matching that is optimal for one side (say men) and pessimal for the other (say women). A sex-equal stable matching aims at providing a fair solution to this problem. We demonstrate that under a class of correlated preferences, the DA algorithm either returns a sex-equal solution or has a very low sex-equality cost.
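
The DA algorithm is standard; the sketch below implements men-proposing deferred acceptance and a rank-based sex-equality cost (the absolute difference between the two sides' total regret), which is one common convention — the paper's exact cost definition may differ.

```python
# Compact sketch: men-proposing Deferred Acceptance and a rank-based sex-equality
# cost (|sum of men's ranks - sum of women's ranks| in the resulting matching).
# The paper's exact cost definition may differ; this follows a common convention.

def deferred_acceptance(men_prefs, women_prefs):
    """men_prefs[m] / women_prefs[w] are preference lists (most preferred first)."""
    women_rank = {w: {m: r for r, m in enumerate(p)} for w, p in women_prefs.items()}
    next_proposal = {m: 0 for m in men_prefs}
    engaged_to = {}                      # woman -> man
    free_men = list(men_prefs)
    while free_men:
        m = free_men.pop()
        w = men_prefs[m][next_proposal[m]]
        next_proposal[m] += 1
        current = engaged_to.get(w)
        if current is None:
            engaged_to[w] = m
        elif women_rank[w][m] < women_rank[w][current]:
            engaged_to[w] = m
            free_men.append(current)
        else:
            free_men.append(m)
    return {m: w for w, m in engaged_to.items()}   # man -> woman

def sex_equality_cost(matching, men_prefs, women_prefs):
    men_regret = sum(men_prefs[m].index(w) for m, w in matching.items())
    women_regret = sum(women_prefs[w].index(m) for m, w in matching.items())
    return abs(men_regret - women_regret)

men = {"m1": ["w1", "w2"], "m2": ["w1", "w2"]}
women = {"w1": ["m2", "m1"], "w2": ["m1", "m2"]}
matching = deferred_acceptance(men, women)
print(matching, sex_equality_cost(matching, men, women))
```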

#11 BOSS: A Bi-directional Search Technique for Optimal Coalition Structure Generation with Minimal Overlapping (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Narayan Changder ; Samir Aknine ; Sarvapali D. Ramchurn ; Animesh Dutta

In this paper, we focus on the Coalition Structure Generation (CSG) problem, which involves finding exhaustive and disjoint partitions of agents such that the efficiency of the entire system is optimized. We propose an efficient hybrid algorithm for optimal coalition structure generation called BOSS. When compared to the state-of-the-art, BOSS is shown to perform better by up to 33.63% on benchmark inputs. The maximum time gain by BOSS is 3392 seconds for 27 agents.

#12 NEAP-F: Network Epoch Accuracy Prediction Framework (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Arushi Chauhan ; Mayank Vatsa ; Richa Singh

Recent work in neural architecture search has spawned interest in algorithms that can predict the performance of convolutional neural networks using minimal time and computation resources. We propose a new framework, the Network Epoch Accuracy Prediction Framework (NEAP-F), which can predict the testing accuracy achieved by a convolutional neural network in one or more epochs. We introduce a novel approach to generate vector representations for networks and to encode the "ease" of classifying an image dataset into a vector. For the vector representations of networks, we focus on the layer parameters and the connections between network layers. A network achieves different accuracies on different image datasets; therefore, we use the image dataset characteristics to create a vector signifying the "ease" of classifying the dataset. After generating these vectors, the prediction models are trained on architectures with skip connections, as seen in current state-of-the-art architectures. The framework predicts accuracies on the order of milliseconds, demonstrating its computational efficiency. It can be easily applied to neural architecture search methods to predict the performance of candidate networks and works on unseen datasets as well.

#13 Robotic Manipulation with Reinforcement Learning, State Representation Learning, and Imitation Learning (Student Abstract) [PDF] [Copy] [Kimi]

Author: Hanxiao Chen

Humans possess the advanced ability to grab, hold, and manipulate objects with dexterous hands. What about robots? Can they interact with the surrounding world intelligently to achieve certain goals (e.g., grasping, object relocation)? Robotic manipulation is central to achieving the promise of robotics and has immense potential to be widely applied in scenarios such as industry, hospitals, and homes. In this work, we aim to address multiple robotic manipulation tasks such as grasping, button-pushing, and door-opening with reinforcement learning (RL), state representation learning (SRL), and imitation learning. For the diverse missions, we built our own PyBullet or MuJoCo simulated environments and independently explored three different learning-style methods to solve these tasks: (1) standard reinforcement learning methods; (2) combined state representation learning (SRL) and RL approaches; (3) imitation-learning-bootstrapped RL algorithms.

#14 Multi-modal User Intent Classification Under the Scenario of Smart Factory (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Yu-Ching Chiu ; Bo-Hao Chang ; Tzu-Yu Chen ; Cheng-Fu Yang ; Nanyi Bi ; Richard Tzong-Han Tsai ; Hung-yi Lee ; Jane Yung-jen Hsu

Question-answering systems are becoming increasingly popular in Natural Language Processing, especially when applied in smart factory settings. A common practice in designing such systems is intent classification. However, in the multiple-stage tasks commonly seen in these settings, relying solely on intent classification may lead to erroneous answers, as questions arising from different work stages may share the same intent but have different contexts and therefore require different answers. To address this problem, we designed an interactive dialogue system that utilizes contextual information to assist intent classification in a multiple-stage task. Specifically, our system incorporates users' utterances with a real-time video feed to better situate users' questions and analyze their intent.

#15 Passive learning of Timed Automata from logs (Student Abstract) [PDF] [Copy] [Kimi]

Author: Lénaïg Cornanguer

We propose a novel algorithm to passively learn deterministic Timed Automata from event sequences annotated with the delays occurring between events. This algorithm produces models that are more specific than those of state-of-the-art algorithms and that better identify the temporal constraints applying to the systems.

#16 Reducing Neural Network Parameter Initialization Into an SMT Problem (Student Abstract) [PDF] [Copy] [Kimi]

Author: Mohamad H. Danesh

Training a neural network (NN) depends on multiple factors, including but not limited to the initial weights. In this paper, we focus on initializing deep NN parameters such that the network performs better than with random or zero initialization. We do this by reducing the initialization process to an SMT problem. Previous works consider certain activation functions on small NNs, whereas the NN studied here is a deep network with different activation functions. Our experiments show that the proposed approach for parameter initialization achieves better performance than randomly initialized networks.
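
A toy illustration of the general idea, assuming the Z3 SMT solver: treat the weights of a tiny one-hidden-layer ReLU network as real-valued variables, constrain the network to classify a handful of points correctly, and read the satisfying assignment off as an initializer. The network size, data, and encoding are assumptions of this sketch, not the paper's reduction.

```python
# Toy illustration (an assumption about the general idea, not the paper's encoding):
# use the Z3 SMT solver to find weights of a tiny one-hidden-layer ReLU network
# that classify a handful of points correctly, then read them off as initial values.
from z3 import Real, Solver, If, sat

def relu(z):
    return If(z > 0, z, 0)

# 2 inputs -> 2 hidden units -> 1 output, weights as SMT real variables.
w1 = [[Real(f"w1_{i}{j}") for j in range(2)] for i in range(2)]
w2 = [Real(f"w2_{j}") for j in range(2)]

def net(x):
    hidden = [relu(sum(w1[i][j] * x[i] for i in range(2))) for j in range(2)]
    return sum(w2[j] * hidden[j] for j in range(2))

# A few (input, label) constraints: output > 0.5 for label 1, < -0.5 for label 0.
data = [((0.0, 1.0), 1), ((1.0, 0.0), 0), ((1.0, 1.0), 1)]
s = Solver()
for x, label in data:
    s.add(net(x) > 0.5 if label == 1 else net(x) < -0.5)
# Keep weights bounded so the solution is a sensible initializer.
for w in [v for row in w1 for v in row] + w2:
    s.add(w >= -1, w <= 1)

if s.check() == sat:
    m = s.model()
    print([[m.eval(w1[i][j]) for j in range(2)] for i in range(2)])
    print([m.eval(w2[j]) for j in range(2)])
```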

#17 Incorporating Curiosity into Personalized Ranking for Collaborative Filtering (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Qiqi Ding ; Yi Cai ; Ke Xu ; Huakui Zhang

Curiosity affects users' selections of items and motivates them to explore items regardless of their interests. This phenomenon is particularly common in social networks. However, existing social-based recommendation methods neglect this feature of social networks, which may decrease recommendation accuracy. Moreover, focusing only on simulating users' preferences can lead to information cocoons. To tackle this problem, we propose a novel Curiosity Enhanced Bayesian Personalized Ranking (CBPR) model. Our model makes full use of psychological theories to model the curiosity aroused in users when they face differing opinions. The experimental results on two public datasets demonstrate the advantages of our CBPR model over existing models.
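
The abstract does not specify how curiosity enters the objective; the sketch below shows one assumed form, a standard BPR pairwise loss reweighted per sample by a curiosity score, purely for illustration.

```python
# Sketch of a BPR-style pairwise objective with a per-sample curiosity weight.
# How CBPR actually models curiosity is not specified in the abstract; the
# weighting form below is an assumption for illustration only.
import torch

def curiosity_weighted_bpr_loss(pos_scores, neg_scores, curiosity, lam=0.5):
    """Standard BPR term -log sigmoid(x_ui - x_uj), scaled up for triples whose
    items arouse more curiosity (curiosity in [0, 1], higher = more curious)."""
    bpr = -torch.nn.functional.logsigmoid(pos_scores - neg_scores)
    weights = 1.0 + lam * curiosity
    return (weights * bpr).mean()

# Hypothetical scores for a mini-batch of (user, positive item, negative item) triples.
pos = torch.randn(32, requires_grad=True)
neg = torch.randn(32, requires_grad=True)
curiosity = torch.rand(32)
loss = curiosity_weighted_bpr_loss(pos, neg, curiosity)
loss.backward()
```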

#18 Demonstrating the Equivalence of List Based and Aggregate Metrics to Measure the Diversity of Recommendations (Student Abstract) [PDF] [Copy] [Kimi]

Author: Maurizio Ferrari Dacrema

The evaluation of recommender systems is frequently focused on accuracy metrics, but this is only part of the picture. The diversity of recommendations is another important dimension that has received renewed interest in recent years. It is known that accuracy and diversity can be conflicting goals, and finding appropriate ways to combine them is still an open research question. Several ways have been proposed to measure the diversity of recommendations and to include its optimization in the loss function used to train the model. Methods optimizing list-based diversity suffer from two drawbacks: the high computational cost of the loss function and the lack of an efficient way to optimize it. In this paper we show the equivalence of the list-based diversity metrics Hamming and Mean Inter-List diversity to the aggregate diversity metric measured with the Herfindahl index, providing a formulation that allows them to be computed and optimized easily.
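
The aggregate-diversity side of the equivalence is straightforward to compute from the recommendation lists; the sketch below measures the Herfindahl index of item exposure shares (the paper's exact normalization may differ).

```python
# Sketch: aggregate diversity via the Herfindahl index of item exposure shares
# (sum of squared shares). Lower HHI = exposure spread over more items = more
# diverse recommendations. The paper's exact normalization may differ.
from collections import Counter

def herfindahl_index(recommendation_lists):
    """recommendation_lists: one list of recommended item ids per user."""
    counts = Counter(item for rec in recommendation_lists for item in rec)
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())

recs = [["a", "b", "c"], ["a", "b", "d"], ["a", "c", "e"]]
hhi = herfindahl_index(recs)
print(f"HHI = {hhi:.3f}, aggregate diversity (1 - HHI) = {1 - hhi:.3f}")
```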

#19 Improving Aerial Instance Segmentation in the Dark with Self-Supervised Low Light Enhancement (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Prateek Garg ; Murari Mandal ; Pratik Narang

Low-light conditions in aerial images adversely affect the performance of several vision-based applications. There is a need for methods that can efficiently remove the low-light attributes and assist the performance of key vision tasks. In this work, we propose a new method that enhances a low-light image in a self-supervised fashion and then sequentially applies detection and segmentation in an end-to-end manner. The proposed method adds a very small overhead in terms of memory and computational power over the original algorithm and delivers superior results. Additionally, we propose the generation of a new low-light aerial dataset using GANs, which can be used to evaluate vision-based networks under similar adverse conditions.

#20 Detecting Lexical Semantic Change across Corpora with Smooth Manifolds (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Anmol Goel ; Ponnurangam Kumaraguru

Comparing two bodies of text and detecting words with significant lexical semantic shift between them is an important part of digital humanities. Traditional approaches have relied on aligning the different embeddings using the Orthogonal Procrustes problem in the Euclidean space. This study presents a geometric framework that leverages smooth Riemannian manifolds for corpus-specific orthogonal rotations and a corpus-independent scaling metric to project the different vector spaces into a shared latent space. This enables us to capture any affine relationship between the embedding spaces while utilising the rich geometry of smooth manifolds.
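
The Euclidean baseline that the paper moves beyond is easy to show concretely: align the two corpora's embedding matrices with orthogonal Procrustes and score per-word shift by cosine distance. This is a sketch of the traditional approach using SciPy, not of the proposed manifold framework; the embeddings below are random placeholders.

```python
# Baseline the paper contrasts with: Euclidean orthogonal Procrustes alignment of
# two embedding spaces, then per-word cosine distance as a crude shift score.
import numpy as np
from scipy.linalg import orthogonal_procrustes

def semantic_shift_scores(emb_a, emb_b):
    """emb_a, emb_b: (vocab, dim) matrices for the shared vocabulary of two corpora."""
    R, _ = orthogonal_procrustes(emb_a, emb_b)   # rotation mapping corpus A onto B
    aligned = emb_a @ R
    cos = np.sum(aligned * emb_b, axis=1) / (
        np.linalg.norm(aligned, axis=1) * np.linalg.norm(emb_b, axis=1)
    )
    return 1.0 - cos                             # higher = larger semantic shift

# Hypothetical embeddings for a shared vocabulary of 1000 words.
rng = np.random.default_rng(0)
emb_a, emb_b = rng.normal(size=(1000, 50)), rng.normal(size=(1000, 50))
shift = semantic_shift_scores(emb_a, emb_b)
print("most shifted word index:", int(np.argmax(shift)))
```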

#21 Evaluating Meta-Reinforcement Learning through a HVAC Control Benchmark (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Yashvir S. Grewal ; Frits de Nijs ; Sarah Goodwin

Meta-Reinforcement Learning (RL) algorithms promise to leverage prior task experience to quickly learn new unseen tasks. Unfortunately, evaluating meta-RL algorithms is complicated by a lack of suitable benchmarks. In this paper we propose adapting a challenging real-world heating, ventilation and air-conditioning (HVAC) control benchmark for meta-RL. Unlike existing benchmark problems, HVAC control has a broader task distribution, and sources of exogenous stochasticity from price and weather predictions which can be shared across task definitions. This can enable greater differentiation between the performance of current meta-RL approaches, and open the way for future research into algorithms that can adapt to entirely new tasks not sampled from the current task distribution.

#22 RGB-D Scene Recognition based on Object-Scene Relation (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Yuhui Guo ; Xun Liang

We develop an RGB-D scene recognition model based on the object-scene relation (RSBR). We first learn a Semantic Network in the semantic domain that classifies the label of a scene based on the labels of all object types. Then, we design an Appearance Network in the appearance domain that recognizes the scene from local captions. We enforce the Semantic Network to guide the Appearance Network during learning. Based on the proposed RSBR model, we obtain state-of-the-art results for RGB-D scene recognition on the SUN RGB-D and NYUD2 datasets.

#23 Global Fusion Attention for Vision and Language Understanding (Student Abstract) [PDF] [Copy] [Kimi]

Authors: Zixin Guo ; Chen Liang ; Ziyu Wan ; Yang Bai

We extend the popular transformer architecture to a multi-modal model processing both visual and textual inputs. We propose a new attention mechanism on a Transformer-based architecture for joint vision and language understanding tasks. Our model fuses multi-level comprehension between images and texts in a weighted manner, which better captures their internal relationships. Experiments on the benchmark VQA dataset CLEVR demonstrate the effectiveness of the proposed attention mechanism. We also observe improvements in the sample efficiency of reinforcement learning in experiments on the grounded language understanding tasks of the BabyAI platform.

#24 Text Embedding Bank for Detailed Image Paragraph Captioning [PDF] [Copy] [Kimi]

Authors: Arjun Gupta ; Zengming Shen ; Thomas Huang

Existing deep learning-based models for image captioning typically consist of an image encoder to extract visual features and a language model decoder, an architecture that has shown promising results in single high-level sentence generation. However, only the word-level guiding signal is available when the image encoder is optimized to extract visual features. The inconsistency between the parallel extraction of visual features and sequential text supervision limits its success when the length of the generated text is long (more than 50 words). We propose a new module, called the Text Embedding Bank (TEB), to address this problem for image paragraph captioning. This module uses the paragraph vector model to learn fixed-length feature representations from a variable-length paragraph. We refer to the fixed-length feature as the TEB. This TEB module plays two roles to benefit paragraph captioning performance. First, it acts as a form of global and coherent deep supervision to regularize visual feature extraction in the image encoder. Second, it acts as a distributed memory to provide features of the whole paragraph to the language model, which alleviates the long-term dependency problem. Adding this module to two existing state-of-the-art methods achieves a new state-of-the-art result on the paragraph captioning Stanford Visual Genome dataset.
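
The TEB is built with the paragraph vector model; a hedged sketch of that step, using gensim's Doc2Vec to map variable-length paragraphs to fixed-length vectors. Hyperparameters and the toy paragraphs are placeholders, and how the vectors feed the captioning model is not shown.

```python
# Hedged sketch of the paragraph vector step (fixed-length embedding of a
# variable-length paragraph) using gensim's Doc2Vec; hyperparameters are
# placeholders, not the paper's configuration.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

paragraphs = [
    "a man rides a bicycle down a busy street . cars are parked on both sides .",
    "two dogs play in a grassy park . one dog carries a red ball in its mouth .",
]
corpus = [TaggedDocument(words=p.split(), tags=[i]) for i, p in enumerate(paragraphs)]

model = Doc2Vec(vector_size=64, min_count=1, epochs=40)
model.build_vocab(corpus)
model.train(corpus, total_examples=model.corpus_count, epochs=model.epochs)

# Fixed-length TEB-style vector for an unseen paragraph.
vec = model.infer_vector("a woman walks a small dog along the beach .".split())
print(vec.shape)   # (64,)
```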

#25 Reinforcement Based Learning on Classification Task Yields Better Generalization and Adversarial Accuracy (Student Abstract) [PDF] [Copy] [Kimi]

Author: Shashi Kant Gupta

Deep learning has become remarkably popular in the field of computer vision, mostly attaining near- or above-human-level performance in various vision tasks. But recent work has also demonstrated that these deep neural networks are very vulnerable to adversarial examples (inputs that look similar to the original data but fool the model into classifying them into a wrong class). In this work, we propose a novel method to train deep learning models on an image classification task. We use a reward-based optimization function, similar to the vanilla policy gradient method in reinforcement learning, to train our model instead of the conventional cross-entropy loss. An empirical evaluation on the CIFAR-10 dataset shows that our method outperforms the same model architecture trained using the cross-entropy loss function (with adversarial training). At the same time, our method generalizes better, with the difference between test accuracy and train accuracy remaining below 2% most of the time, compared to the cross-entropy model, whose difference mostly remains above 2%.
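
A hedged sketch of the reward-based objective described: sample a class from the softmax output, give reward 1 if it matches the label, and apply a REINFORCE-style update in place of cross-entropy. The baseline and other details are assumptions of this sketch, not the paper's exact formulation.

```python
# Hedged sketch of the reward-based objective: sample a class from the softmax,
# reward 1 if it matches the label, and apply a REINFORCE-style update in place
# of cross-entropy. The baseline and other details are assumptions of this sketch.
import torch
from torch.distributions import Categorical

def reinforce_classification_loss(logits, labels):
    dist = Categorical(logits=logits)
    actions = dist.sample()                        # predicted class, sampled
    rewards = (actions == labels).float()          # 1 if correct, 0 otherwise
    baseline = rewards.mean()                      # simple variance-reduction baseline
    return -((rewards - baseline) * dist.log_prob(actions)).mean()

# Usage inside an ordinary training loop (model and data are placeholders):
# logits = model(images)
# loss = reinforce_classification_loss(logits, labels)
# loss.backward(); optimizer.step()
```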